Overview

Dataset statistics

Number of variables: 40
Number of observations: 260601
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 178.9 MiB
Average record size in memory: 720.0 B
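
The average record size follows from the total size and the number of observations. A minimal sketch of that arithmetic, assuming the report's MiB means 2**20 bytes (the variable names are illustrative and not part of the report):

```python
# Figures copied from the dataset statistics.
total_size_mib = 178.9
n_observations = 260601

MIB = 1024 ** 2  # assumption: 1 MiB = 2**20 bytes

# Average bytes per row; the input is rounded, so expect roughly 720 B.
avg_record_bytes = total_size_mib * MIB / n_observations
print(f"average record size: {avg_record_bytes:.0f} B")
```

Because the total is rounded to one decimal place, the result only approximates the reported 720.0 B.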

Variable types

BOOL: 22
CAT: 9
NUM: 9

Warnings

building_id has unique values (Unique)
geo_level_1_id has 4011 (1.5%) zeros (Zeros)
age has 26041 (10.0%) zeros (Zeros)
count_families has 20862 (8.0%) zeros (Zeros)
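
The percentages in the zero warnings are the zero counts divided by the number of observations. A small sketch that recomputes them, using only the counts listed above (the dictionary layout is illustrative):

```python
n_observations = 260601

# Zero counts as reported in the warnings.
zero_counts = {
    "geo_level_1_id": 4011,
    "age": 26041,
    "count_families": 20862,
}

# Share of rows equal to zero for each flagged variable.
zero_share = {name: count / n_observations for name, count in zero_counts.items()}

for name, share in zero_share.items():
    print(f"{name}: {share:.1%} zeros")
```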

Reproduction

Analysis started: 2020-12-05 14:37:31.232692
Analysis finished: 2020-12-05 14:38:32.139489
Duration: 1 minute and 0.91 seconds
Software version: pandas-profiling v2.9.0
Download configuration: config.yaml
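
The duration can be checked from the two timestamps. A minimal sketch using only the Python standard library:

```python
from datetime import datetime

# Timestamps copied from the reproduction details.
started = datetime.fromisoformat("2020-12-05 14:37:31.232692")
finished = datetime.fromisoformat("2020-12-05 14:38:32.139489")

# Elapsed wall-clock time, split into whole minutes and remaining seconds.
elapsed = (finished - started).total_seconds()
minutes, seconds = divmod(elapsed, 60)
print(f"{int(minutes)} minute(s) and {seconds:.2f} seconds")
```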

Variables

building_id
Real number (ℝ≥0)

UNIQUE

Distinct: 260601
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 525675.4828
Minimum: 4
Maximum: 1052934
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.0 MiB

Quantile statistics

Minimum: 4
5-th percentile: 52114
Q1: 261190
median: 525757
Q3: 789762
95-th percentile: 1000724
Maximum: 1052934
Range: 1052930
Interquartile range (IQR): 528572
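
Range and interquartile range are differences of the quantiles listed above. A short sketch for building_id (the dictionary keys are illustrative names):

```python
# Quantiles of building_id as reported.
quantiles = {
    "min": 4,
    "q1": 261190,
    "median": 525757,
    "q3": 789762,
    "max": 1052934,
}

value_range = quantiles["max"] - quantiles["min"]  # Range = Maximum - Minimum
iqr = quantiles["q3"] - quantiles["q1"]            # IQR = Q3 - Q1
print(value_range, iqr)
```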

Descriptive statistics

Standard deviation: 304544.999
Coefficient of variation (CV): 0.5793403136
Kurtosis: -1.203878964
Mean: 525675.4828
Median Absolute Deviation (MAD): 264277
Skewness: 0.001882356737
Sum: 1.369915565e+11
Variance: 9.274765644e+10
Monotonicity: Not monotonic
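
Several of these descriptive statistics are tied together: the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, and the sum is the mean times the number of observations. A sketch that checks the building_id figures against each other, within the rounding of the reported values:

```python
import math

# Reported values for building_id.
mean = 525675.4828
std = 304544.999
n_observations = 260601

cv = std / mean                # coefficient of variation
variance = std ** 2            # variance is the squared standard deviation
total = mean * n_observations  # sum of all values

# Compare with the reported figures, allowing for rounding in the inputs.
print(math.isclose(cv, 0.5793403136, rel_tol=1e-6))
print(math.isclose(variance, 9.274765644e10, rel_tol=1e-6))
print(math.isclose(total, 1.369915565e11, rel_tol=1e-6))
```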
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%) 
10526701< 0.1%
 
8473041< 0.1%
 
3681021< 0.1%
 
7299861< 0.1%
 
9005781< 0.1%
 
8964801< 0.1%
 
8084151< 0.1%
 
8125051< 0.1%
 
2902641< 0.1%
 
2697821< 0.1%
 
7258921< 0.1%
 
2656841< 0.1%
 
2759231< 0.1%
 
8411671< 0.1%
 
351821< 0.1%
 
3004871< 0.1%
 
5724011< 0.1%
 
8268221< 0.1%
 
8206771< 0.1%
 
3107221< 0.1%
 
6936171< 0.1%
 
4766051< 0.1%
 
4602131< 0.1%
 
4724991< 0.1%
 
9988341< 0.1%
 
Other values (260576)260576> 99.9%
 
ValueCountFrequency (%) 
41< 0.1%
 
81< 0.1%
 
121< 0.1%
 
161< 0.1%
 
171< 0.1%
 
251< 0.1%
 
281< 0.1%
 
311< 0.1%
 
341< 0.1%
 
361< 0.1%
 
ValueCountFrequency (%) 
10529341< 0.1%
 
10529311< 0.1%
 
10529291< 0.1%
 
10529261< 0.1%
 
10529211< 0.1%
 
10529151< 0.1%
 
10529111< 0.1%
 
10529091< 0.1%
 
10529081< 0.1%
 
10529061< 0.1%
 

geo_level_1_id
Real number (ℝ≥0)

ZEROS

Distinct: 31
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 13.90035341
Minimum: 0
Maximum: 30
Zeros: 4011
Zeros (%): 1.5%
Memory size: 2.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 3
Q1: 7
median: 12
Q3: 21
95-th percentile: 27
Maximum: 30
Range: 30
Interquartile range (IQR): 14

Descriptive statistics

Standard deviation: 8.033616625
Coefficient of variation (CV): 0.5779433361
Kurtosis: -1.213248785
Mean: 13.90035341
Median Absolute Deviation (MAD): 6
Skewness: 0.2725303548
Sum: 3622446
Variance: 64.53899608
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=31)
ValueCountFrequency (%) 
6243819.4%
 
26226158.7%
 
10220798.5%
 
17218138.4%
 
8190807.3%
 
7189947.3%
 
20172166.6%
 
21148895.7%
 
4145685.6%
 
27125324.8%
 
1396083.7%
 
1182203.2%
 
375402.9%
 
2262522.4%
 
2556242.2%
 
1643321.7%
 
040111.5%
 
939581.5%
 
1231941.2%
 
1831891.2%
 
127011.0%
 
526901.0%
 
3026861.0%
 
1523200.9%
 
1417140.7%
 
Other values (6)43951.7%
 
ValueCountFrequency (%) 
040111.5%
 
127011.0%
 
29310.4%
 
375402.9%
 
4145685.6%
 
526901.0%
 
6243819.4%
 
7189947.3%
 
8190807.3%
 
939581.5%
 
ValueCountFrequency (%) 
3026861.0%
 
293960.2%
 
282650.1%
 
27125324.8%
 
26226158.7%
 
2556242.2%
 
2413100.5%
 
2311210.4%
 
2262522.4%
 
21148895.7%
 

geo_level_2_id
Real number (ℝ≥0)

Distinct: 1414
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 701.0746851
Minimum: 0
Maximum: 1427
Zeros: 38
Zeros (%): < 0.1%
Memory size: 2.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 69
Q1: 350
median: 702
Q3: 1050
95-th percentile: 1377
Maximum: 1427
Range: 1427
Interquartile range (IQR): 700

Descriptive statistics

Standard deviation: 412.7107336
Coefficient of variation (CV): 0.5886829782
Kurtosis: -1.188232475
Mean: 701.0746851
Median Absolute Deviation (MAD): 349
Skewness: 0.02895738139
Sum: 182700764
Variance: 170330.1496
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%) 
3940381.5%
 
15825201.0%
 
18120800.8%
 
138720400.8%
 
15718970.7%
 
36317600.7%
 
46317400.7%
 
67317040.7%
 
53316840.6%
 
88316260.6%
 
139415370.6%
 
54814970.6%
 
100614500.6%
 
72013590.5%
 
99111450.4%
 
100111350.4%
 
88911140.4%
 
76510910.4%
 
125310900.4%
 
115510690.4%
 
140110630.4%
 
88610530.4%
 
15110430.4%
 
66010410.4%
 
13110380.4%
 
Other values (1389)22178785.1%
 
ValueCountFrequency (%) 
038< 0.1%
 
12040.1%
 
377< 0.1%
 
43150.1%
 
525< 0.1%
 
62< 0.1%
 
7100< 0.1%
 
8120< 0.1%
 
93330.1%
 
103540.1%
 
ValueCountFrequency (%) 
14276< 0.1%
 
14262860.1%
 
14254660.2%
 
14247< 0.1%
 
14233< 0.1%
 
14222160.1%
 
14212540.1%
 
142010< 0.1%
 
141995< 0.1%
 
14181520.1%
 

geo_level_3_id
Real number (ℝ≥0)

Distinct: 11595
Distinct (%): 4.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6257.876148
Minimum: 0
Maximum: 12567
Zeros: 2
Zeros (%): < 0.1%
Memory size: 2.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 611
Q1: 3073
median: 6270
Q3: 9412
95-th percentile: 11927
Maximum: 12567
Range: 12567
Interquartile range (IQR): 6339

Descriptive statistics

Standard deviation: 3646.369645
Coefficient of variation (CV): 0.5826848532
Kurtosis: -1.213896506
Mean: 6257.876148
Median Absolute Deviation (MAD): 3171
Skewness: 0.0003935120899
Sum: 1630808782
Variance: 13296011.59
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%) 
6336510.2%
 
91336470.2%
 
6215300.2%
 
112464700.2%
 
20054660.2%
 
114404550.2%
 
77234430.2%
 
92293810.1%
 
24523490.1%
 
122583120.1%
 
82363030.1%
 
104453020.1%
 
21702830.1%
 
66262830.1%
 
25372590.1%
 
852520.1%
 
4062510.1%
 
69732480.1%
 
78682470.1%
 
39042410.1%
 
102212370.1%
 
107952360.1%
 
18512360.1%
 
113192300.1%
 
107282280.1%
 
Other values (11570)25206196.7%
 
ValueCountFrequency (%) 
02< 0.1%
 
16< 0.1%
 
39< 0.1%
 
514< 0.1%
 
621< 0.1%
 
72< 0.1%
 
831< 0.1%
 
93< 0.1%
 
101< 0.1%
 
1162< 0.1%
 
ValueCountFrequency (%) 
125671< 0.1%
 
125657< 0.1%
 
125646< 0.1%
 
1256324< 0.1%
 
125623< 0.1%
 
1256119< 0.1%
 
1256017< 0.1%
 
125596< 0.1%
 
125586< 0.1%
 
1255744< 0.1%
 

count_floors_pre_eq
Real number (ℝ≥0)

Distinct: 9
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.129723217
Minimum: 1
Maximum: 9
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.0 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
median: 2
Q3: 2
95-th percentile: 3
Maximum: 9
Range: 8
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.7276645453
Coefficient of variation (CV): 0.3416709456
Kurtosis: 2.322597881
Mean: 2.129723217
Median Absolute Deviation (MAD): 0
Skewness: 0.8341129586
Sum: 555008
Variance: 0.5294956905
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=9)
ValueCountFrequency (%) 
215662360.1%
 
35561721.3%
 
14044115.5%
 
454242.1%
 
522460.9%
 
62090.1%
 
739< 0.1%
 
91< 0.1%
 
81< 0.1%
 
ValueCountFrequency (%) 
14044115.5%
 
215662360.1%
 
35561721.3%
 
454242.1%
 
522460.9%
 
62090.1%
 
739< 0.1%
 
81< 0.1%
 
91< 0.1%
 
ValueCountFrequency (%) 
91< 0.1%
 
81< 0.1%
 
739< 0.1%
 
62090.1%
 
522460.9%
 
454242.1%
 
35561721.3%
 
215662360.1%
 
14044115.5%
 

age
Real number (ℝ≥0)

ZEROS

Distinct: 42
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 26.53502865
Minimum: 0
Maximum: 995
Zeros: 26041
Zeros (%): 10.0%
Memory size: 2.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 10
median: 15
Q3: 30
95-th percentile: 60
Maximum: 995
Range: 995
Interquartile range (IQR): 20

Descriptive statistics

Standard deviation: 73.56593652
Coefficient of variation (CV): 2.772408408
Kurtosis: 157.2482363
Mean: 26.53502865
Median Absolute Deviation (MAD): 10
Skewness: 12.19249422
Sum: 6915055
Variance: 5411.947016
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=42)
ValueCountFrequency (%) 
103889614.9%
 
153601013.8%
 
53369712.9%
 
203218212.3%
 
02604110.0%
 
25243669.3%
 
30180286.9%
 
35107104.1%
 
40105594.1%
 
5072572.8%
 
4547111.8%
 
6036121.4%
 
8030551.2%
 
5520330.8%
 
7019750.8%
 
99513900.5%
 
10013640.5%
 
6511230.4%
 
9010850.4%
 
858470.3%
 
755120.2%
 
954140.2%
 
1201800.1%
 
1501420.1%
 
200106< 0.1%
 
Other values (17)3060.1%
 
ValueCountFrequency (%) 
02604110.0%
 
53369712.9%
 
103889614.9%
 
153601013.8%
 
203218212.3%
 
25243669.3%
 
30180286.9%
 
35107104.1%
 
40105594.1%
 
4547111.8%
 
ValueCountFrequency (%) 
99513900.5%
 
200106< 0.1%
 
1952< 0.1%
 
1903< 0.1%
 
1851< 0.1%
 
1807< 0.1%
 
1755< 0.1%
 
1706< 0.1%
 
1652< 0.1%
 
1606< 0.1%
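
The extreme values above show 1390 rows with age 995, while the next largest value is 200. The report does not say what 995 means. The sketch below assumes, purely for illustration, that it is a placeholder or cap rather than a true age, and shows how much those rows move the mean. It uses only the reported sum and counts:

```python
# Reported figures for age.
n_observations = 260601
age_sum = 6915055

# Assumption (not stated in the report): 995 is a sentinel value.
sentinel_value = 995
sentinel_count = 1390

mean_all = age_sum / n_observations

# Mean after excluding the rows that carry the assumed sentinel.
mean_without = (age_sum - sentinel_value * sentinel_count) / (
    n_observations - sentinel_count
)

print(f"mean with 995 rows:    {mean_all:.2f}")
print(f"mean without 995 rows: {mean_without:.2f}")
```

Under that assumption the mean drops by roughly five years, which is consistent with the large skewness and kurtosis reported for this variable.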
 

area_percentage
Real number (ℝ≥0)

Distinct: 84
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 8.018050583
Minimum: 1
Maximum: 100
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.0 MiB

Quantile statistics

Minimum: 1
5-th percentile: 3
Q1: 5
median: 7
Q3: 9
95-th percentile: 16
Maximum: 100
Range: 99
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 4.392230936
Coefficient of variation (CV): 0.5477928694
Kurtosis: 30.43825794
Mean: 8.018050583
Median Absolute Deviation (MAD): 2
Skewness: 3.526082314
Sum: 2089512
Variance: 19.29169259
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%) 
64201316.1%
 
73675214.1%
 
53272412.6%
 
82844510.9%
 
9221998.5%
 
4192367.4%
 
10156136.0%
 
11139075.3%
 
3118374.5%
 
1275812.9%
 
1358152.2%
 
1441621.6%
 
1534891.3%
 
231811.2%
 
1626061.0%
 
1724891.0%
 
1916020.6%
 
1813170.5%
 
2010530.4%
 
238650.3%
 
216450.2%
 
244050.2%
 
223910.2%
 
252600.1%
 
262470.1%
 
Other values (59)17670.7%
 
ValueCountFrequency (%) 
190< 0.1%
 
231811.2%
 
3118374.5%
 
4192367.4%
 
53272412.6%
 
64201316.1%
 
73675214.1%
 
82844510.9%
 
9221998.5%
 
10156136.0%
 
ValueCountFrequency (%) 
1001< 0.1%
 
963< 0.1%
 
901< 0.1%
 
865< 0.1%
 
854< 0.1%
 
843< 0.1%
 
833< 0.1%
 
821< 0.1%
 
801< 0.1%
 
781< 0.1%
 

height_percentage
Real number (ℝ≥0)

Distinct: 27
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.434365179
Minimum: 2
Maximum: 32
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.0 MiB

Quantile statistics

Minimum: 2
5-th percentile: 3
Q1: 4
median: 5
Q3: 6
95-th percentile: 9
Maximum: 32
Range: 30
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.918418221
Coefficient of variation (CV): 0.3530160667
Kurtosis: 14.31852616
Mean: 5.434365179
Median Absolute Deviation (MAD): 1
Skewness: 1.808261757
Sum: 1416201
Variance: 3.68032847
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=27)
ValueCountFrequency (%) 
57851330.1%
 
64647717.8%
 
43776314.5%
 
73546513.6%
 
32595710.0%
 
8139025.3%
 
293053.6%
 
953762.1%
 
1044921.7%
 
119170.4%
 
129070.3%
 
137590.3%
 
152920.1%
 
161790.1%
 
3275< 0.1%
 
1871< 0.1%
 
1466< 0.1%
 
2033< 0.1%
 
2113< 0.1%
 
2311< 0.1%
 
179< 0.1%
 
197< 0.1%
 
244< 0.1%
 
253< 0.1%
 
262< 0.1%
 
Other values (2)3< 0.1%
 
ValueCountFrequency (%) 
293053.6%
 
32595710.0%
 
43776314.5%
 
57851330.1%
 
64647717.8%
 
73546513.6%
 
8139025.3%
 
953762.1%
 
1044921.7%
 
119170.4%
 
ValueCountFrequency (%) 
3275< 0.1%
 
311< 0.1%
 
282< 0.1%
 
262< 0.1%
 
253< 0.1%
 
244< 0.1%
 
2311< 0.1%
 
2113< 0.1%
 
2033< 0.1%
 
197< 0.1%
 
Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
t      216757  83.2%
n      35528   13.6%
o      8316    3.2%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 3
Unique unicode categories: 1
Unique unicode scripts: 1
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
t21675783.2%
 
n3552813.6%
 
o83163.2%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter260601100.0%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
t21675783.2%
 
n3552813.6%
 
o83163.2%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin260601100.0%
 

Most frequent Latin characters

ValueCountFrequency (%) 
t21675783.2%
 
n3552813.6%
 
o83163.2%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII260601100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
t21675783.2%
 
n3552813.6%
 
o83163.2%
 

foundation_type
Categorical

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
r      219196  84.1%
w      15118   5.8%
u      14260   5.5%
i      10579   4.1%
h      1448    0.6%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 5
Unique unicode categories: 1
Unique unicode scripts: 1
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
r21919684.1%
 
w151185.8%
 
u142605.5%
 
i105794.1%
 
h14480.6%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter260601100.0%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
r21919684.1%
 
w151185.8%
 
u142605.5%
 
i105794.1%
 
h14480.6%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin260601100.0%
 

Most frequent Latin characters

ValueCountFrequency (%) 
r21919684.1%
 
w151185.8%
 
u142605.5%
 
i105794.1%
 
h14480.6%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII260601100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
r21919684.1%
 
w151185.8%
 
u142605.5%
 
i105794.1%
 
h14480.6%
 

roof_type
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
n      182842  70.2%
q      61576   23.6%
x      16183   6.2%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 3
Unique unicode categories: 1
Unique unicode scripts: 1
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
n18284270.2%
 
q6157623.6%
 
x161836.2%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter260601100.0%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
n18284270.2%
 
q6157623.6%
 
x161836.2%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin260601100.0%
 

Most frequent Latin characters

ValueCountFrequency (%) 
n18284270.2%
 
q6157623.6%
 
x161836.2%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII260601100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
n18284270.2%
 
q6157623.6%
 
x161836.2%
 
Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
f      209619  80.4%
x      24877   9.5%
v      24593   9.4%
z      1004    0.4%
m      508     0.2%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 5
Unique unicode categories: 1
Unique unicode scripts: 1
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
f20961980.4%
 
x248779.5%
 
v245939.4%
 
z10040.4%
 
m5080.2%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter260601100.0%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
f20961980.4%
 
x248779.5%
 
v245939.4%
 
z10040.4%
 
m5080.2%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin260601100.0%
 

Most frequent Latin characters

ValueCountFrequency (%) 
f20961980.4%
 
x248779.5%
 
v245939.4%
 
z10040.4%
 
m5080.2%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII260601100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
f20961980.4%
 
x248779.5%
 
v245939.4%
 
z10040.4%
 
m5080.2%
 

other_floor_type
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
q      165282  63.4%
x      43448   16.7%
j      39843   15.3%
s      12028   4.6%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 4
Unique unicode categories: 1
Unique unicode scripts: 1
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
q16528263.4%
 
x4344816.7%
 
j3984315.3%
 
s120284.6%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter260601100.0%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
q16528263.4%
 
x4344816.7%
 
j3984315.3%
 
s120284.6%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin260601100.0%
 

Most frequent Latin characters

ValueCountFrequency (%) 
q16528263.4%
 
x4344816.7%
 
j3984315.3%
 
s120284.6%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII260601100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
q16528263.4%
 
x4344816.7%
 
j3984315.3%
 
s120284.6%
 

position
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
s      202090  77.5%
t      42896   16.5%
j      13282   5.1%
o      2333    0.9%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 4
Unique unicode categories: 1
Unique unicode scripts: 1
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
s20209077.5%
 
t4289616.5%
 
j132825.1%
 
o23330.9%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter260601100.0%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
s20209077.5%
 
t4289616.5%
 
j132825.1%
 
o23330.9%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin260601100.0%
 

Most frequent Latin characters

ValueCountFrequency (%) 
s20209077.5%
 
t4289616.5%
 
j132825.1%
 
o23330.9%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII260601100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
s20209077.5%
 
t4289616.5%
 
j132825.1%
 
o23330.9%
 
Distinct: 10
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
d      250072  96.0%
q      5692    2.2%
u      3649    1.4%
s      346     0.1%
c      325     0.1%
a      252     0.1%
o      159     0.1%
m      46      < 0.1%
n      38      < 0.1%
f      22      < 0.1%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 10
Unique unicode categories: 1
Unique unicode scripts: 1
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
d25007296.0%
 
q56922.2%
 
u36491.4%
 
s3460.1%
 
c3250.1%
 
a2520.1%
 
o1590.1%
 
m46< 0.1%
 
n38< 0.1%
 
f22< 0.1%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter260601100.0%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
d25007296.0%
 
q56922.2%
 
u36491.4%
 
s3460.1%
 
c3250.1%
 
a2520.1%
 
o1590.1%
 
m46< 0.1%
 
n38< 0.1%
 
f22< 0.1%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin260601100.0%
 

Most frequent Latin characters

ValueCountFrequency (%) 
d25007296.0%
 
q56922.2%
 
u36491.4%
 
s3460.1%
 
c3250.1%
 
a2520.1%
 
o1590.1%
 
m46< 0.1%
 
n38< 0.1%
 
f22< 0.1%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII260601100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
d25007296.0%
 
q56922.2%
 
u36491.4%
 
s3460.1%
 
c3250.1%
 
a2520.1%
 
o1590.1%
 
m46< 0.1%
 
n38< 0.1%
 
f22< 0.1%
 
Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB
Value  Count   Frequency (%)
0      237500  91.1%
1      23101   8.9%

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB
Value  Count   Frequency (%)
1      198561  76.2%
0      62040   23.8%

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB
Value  Count   Frequency (%)
0      251654  96.6%
1      8947    3.4%

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB
Value  Count   Frequency (%)
0      255849  98.2%
1      4752    1.8%

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB
Value  Count   Frequency (%)
0      242840  93.2%
1      17761   6.8%

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB
Value  Count   Frequency (%)
0      240986  92.5%
1      19615   7.5%

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB
Value  Count   Frequency (%)
0      194151  74.5%
1      66450   25.5%

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB
Value  Count   Frequency (%)
0      238447  91.5%
1      22154   8.5%

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB
Value  Count   Frequency (%)
0      249502  95.7%
1      11099   4.3%

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB
Value  Count   Frequency (%)
0      256468  98.4%
1      4133    1.6%

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB
Value  Count   Frequency (%)
0      256696  98.5%
1      3905    1.5%

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
v      250939  96.3%
a      5512    2.1%
w      2677    1.0%
r      1473    0.6%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 4
Unique unicode categories: 1
Unique unicode scripts: 1
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
v25093996.3%
 
a55122.1%
 
w26771.0%
 
r14730.6%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter260601100.0%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
v25093996.3%
 
a55122.1%
 
w26771.0%
 
r14730.6%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin260601100.0%
 

Most frequent Latin characters

ValueCountFrequency (%) 
v25093996.3%
 
a55122.1%
 
w26771.0%
 
r14730.6%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII260601100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
v25093996.3%
 
a55122.1%
 
w26771.0%
 
r14730.6%
 

count_families
Real number (ℝ≥0)

ZEROS

Distinct: 10
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.9839486418
Minimum: 0
Maximum: 9
Zeros: 20862
Zeros (%): 8.0%
Memory size: 2.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1
median: 1
Q3: 1
95-th percentile: 2
Maximum: 9
Range: 9
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.4183889779
Coefficient of variation (CV): 0.425214244
Kurtosis: 17.67094319
Mean: 0.9839486418
Median Absolute Deviation (MAD): 0
Skewness: 1.634757873
Sum: 256418
Variance: 0.1750493368
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
122611586.8%
 
0208628.0%
 
2112944.3%
 
318020.7%
 
43890.1%
 
5104< 0.1%
 
622< 0.1%
 
77< 0.1%
 
94< 0.1%
 
82< 0.1%
 
ValueCountFrequency (%) 
0208628.0%
 
122611586.8%
 
2112944.3%
 
318020.7%
 
43890.1%
 
5104< 0.1%
 
622< 0.1%
 
77< 0.1%
 
82< 0.1%
 
94< 0.1%
 
ValueCountFrequency (%) 
94< 0.1%
 
82< 0.1%
 
77< 0.1%
 
622< 0.1%
 
5104< 0.1%
 
43890.1%
 
318020.7%
 
2112944.3%
 
122611586.8%
 
0208628.0%
 
Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB
Value  Count   Frequency (%)
0      231445  88.8%
1      29156   11.2%

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB
Value  Count   Frequency (%)
0      243824  93.6%
1      16777   6.4%

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB
Value  Count   Frequency (%)
0      251838  96.6%
1      8763    3.4%

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB
Value  Count   Frequency (%)
0      258490  99.2%
1      2111    0.8%

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB
Value  Count   Frequency (%)
0      260356  99.9%
1      245     0.1%

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB
Value  Count   Frequency (%)
0      260507  > 99.9%
1      94      < 0.1%

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB
Value  Count   Frequency (%)
0      260322  99.9%
1      279     0.1%

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB
Value  Count   Frequency (%)
0      260552  > 99.9%
1      49      < 0.1%

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB
Value  Count   Frequency (%)
0      260563  > 99.9%
1      38      < 0.1%

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB
Value  Count   Frequency (%)
0      260578  > 99.9%
1      23      < 0.1%

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB
Value  Count   Frequency (%)
0      259267  99.5%
1      1334    0.5%

damage_grade
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
2      148259  56.9%
3      87218   33.5%
1      25124   9.6%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 3
Unique unicode categories: 1
Unique unicode scripts: 1
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

Value    Count     Frequency (%)
2        148259    56.9%
3        87218     33.5%
1        25124     9.6%

Most occurring categories

Value             Count     Frequency (%)
Decimal Number    260601    100.0%

Most frequent Decimal Number characters

Value    Count     Frequency (%)
2        148259    56.9%
3        87218     33.5%
1        25124     9.6%

Most occurring scripts

Value     Count     Frequency (%)
Common    260601    100.0%

Most frequent Common characters

Value    Count     Frequency (%)
2        148259    56.9%
3        87218     33.5%
1        25124     9.6%

Most occurring blocks

Value    Count     Frequency (%)
ASCII    260601    100.0%

Most frequent ASCII characters

Value    Count     Frequency (%)
2        148259    56.9%
3        87218     33.5%
1        25124     9.6%

Interactions

Pairwise interaction plots (81 images) are not reproduced here.

Correlations


Pearson's r

Pearson's correlation coefficient (r) measures the linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for a linear relationship the steepness of the line (its angle to the x-axis) does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
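
The calculation described above can be sketched in plain Python. This is an illustrative implementation, not the code pandas-profiling runs; the function name `pearson_r` is an assumption made for the example.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    # The 1/n factors of the covariance and the standard deviations cancel,
    # so plain sums of products are enough.
    cov = sum(a * b for a, b in zip(dx, dy))
    var_x = sum(a * a for a in dx)
    var_y = sum(b * b for b in dy)
    if var_x == 0 or var_y == 0:
        raise ValueError("r is undefined when either variable is constant")
    return cov / sqrt(var_x * var_y)


xs = [1, 2, 3, 5]
ys = [2, 1, 4, 3]
# Shifting and positively rescaling either variable leaves r unchanged.
same = pearson_r([10 * x + 3 for x in xs], [0.5 * y - 7 for y in ys])
print(pearson_r(xs, ys), same)
```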

Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures the monotonic correlation between two variables, and is therefore better than Pearson's r at capturing nonlinear monotonic relationships. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
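
A minimal sketch of this rank-based calculation in plain Python follows. Giving tied values the average of their ranks is a common convention and is an assumption here; `average_ranks` and `spearman_rho` are hypothetical names, not pandas-profiling functions.

```python
from math import sqrt


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Pearson correlation computed on the rank variables of xs and ys."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# A cubic relationship is nonlinear but monotonic, so rho is 1 (up to rounding).
xs = [1, 2, 3, 4, 5]
print(spearman_rho(xs, [x ** 3 for x in xs]))
```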

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
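
The pair-counting definition above corresponds to the variant usually called tau-a, sketched below in plain Python. Library implementations often use tie-adjusted variants such as tau-b, so their results can differ when ties are present; the function name `kendall_tau_a` is an assumption for this example.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total number of pairs; tied pairs count as neither."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1   # both variables move in the same direction
        elif sign < 0:
            discordant += 1   # the variables move in opposite directions
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# One swapped pair out of six: 5 concordant, 1 discordant.
print(kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4]))
```

The double loop is quadratic in the number of observations, which is fine for illustration but slow for a dataset of this size.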

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available for the phik package.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
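
As a rough illustration, the bias-corrected coefficient can be computed from a contingency table of counts as sketched below. The sketch follows the commonly cited form of Bergsma's correction; it is not taken from the pandas-profiling source, and the function name `cramers_v_corrected` is hypothetical.

```python
from math import sqrt


def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows."""
    r = len(table)
    k = len(table[0])
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(r)) for j in range(k)]

    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Correction terms: shrink phi^2 and the effective table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    if denom <= 0:
        return 0.0
    return sqrt(phi2_corr / denom)


print(cramers_v_corrected([[10, 20], [20, 40]]))  # proportional rows: independent
print(cramers_v_corrected([[50, 0], [0, 50]]))    # diagonal table: perfect association
```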

Missing values


Sample

First rows

building_idgeo_level_1_idgeo_level_2_idgeo_level_3_idcount_floors_pre_eqagearea_percentageheight_percentageland_surface_conditionfoundation_typeroof_typeground_floor_typeother_floor_typepositionplan_configurationhas_superstructure_adobe_mudhas_superstructure_mud_mortar_stonehas_superstructure_stone_flaghas_superstructure_cement_mortar_stonehas_superstructure_mud_mortar_brickhas_superstructure_cement_mortar_brickhas_superstructure_timberhas_superstructure_bamboohas_superstructure_rc_non_engineeredhas_superstructure_rc_engineeredhas_superstructure_otherlegal_ownership_statuscount_familieshas_secondary_usehas_secondary_use_agriculturehas_secondary_use_hotelhas_secondary_use_rentalhas_secondary_use_institutionhas_secondary_use_schoolhas_secondary_use_industryhas_secondary_use_health_posthas_secondary_use_gov_officehas_secondary_use_use_policehas_secondary_use_otherdamage_grade
080290664871219823065trnfqtd11000000000v1000000000003
1288308900281221087ornxqsd01000000000v1000000000002
29494721363897321055trnfxtd01000000000v1000000000003
3590882224181069421065trnfxsd01000011000v1000000000002
420194411131148833089trnfxsd10000000000v1000000000003
53330208558608921095trnfqsd01000000000v1110000000002
672845194751206622534nrnxqsd01000000000v1000000000003
747551520323122362086twqvxsu00000110000v1000000000001
84411260757721921586trqfqsd01000010000v1000000000002
99895002688699410134tinvjsd00000100000v1000000000001

Last rows

building_idgeo_level_1_idgeo_level_2_idgeo_level_3_idcount_floors_pre_eqagearea_percentageheight_percentageland_surface_conditionfoundation_typeroof_typeground_floor_typeother_floor_typepositionplan_configurationhas_superstructure_adobe_mudhas_superstructure_mud_mortar_stonehas_superstructure_stone_flaghas_superstructure_cement_mortar_stonehas_superstructure_mud_mortar_brickhas_superstructure_cement_mortar_brickhas_superstructure_timberhas_superstructure_bamboohas_superstructure_rc_non_engineeredhas_superstructure_rc_engineeredhas_superstructure_otherlegal_ownership_statuscount_familieshas_secondary_usehas_secondary_use_agriculturehas_secondary_use_hotelhas_secondary_use_rentalhas_secondary_use_institutionhas_secondary_use_schoolhas_secondary_use_industryhas_secondary_use_health_posthas_secondary_use_gov_officehas_secondary_use_use_policehas_secondary_use_otherdamage_grade
26059156080520368598012553nrnfjsd01000000000v1110000000003
260592207683101382190322555trnfqsd01000010000v1000000000002
2605932264218767861325135trnfqsd01000000000v1110000000002
260594159555271811537601312trnfxjd00001000000v1000000000002
2605958270128268471822085trnfqsd01000000000v1000000000003
260596688636251335162115563nrnfjsq01000000000v1000000000002
2605976694851771520602065trnfqsd01000000000v1000000000003
2605986025121751816335567trqfqsd01000000000v1000000000003
26059915140926391851210146trxvsjd00000100000v1000000000002
260600747594219910131076nrnfqjd01000000000v3000000000003